109 research outputs found

    Statistical parsing of noun phrase structure

    Noun phrases (NPs) are a crucial part of natural language, exhibiting in many cases an extremely complex structure. However, NP structure is largely ignored by the statistical parsing field, as the most widely-used corpus is not annotated with it. This lack of gold-standard data has restricted all previous efforts to parse NPs, making it impossible to perform the supervised experiments that have achieved high performance in so many Natural Language Processing (NLP) tasks. We comprehensively solve this problem by manually annotating NP structure for the entire Wall Street Journal section of the Penn Treebank. The inter-annotator agreement scores that we attain refute the belief that the task is too difficult, and demonstrate that consistent NP annotation is possible. Our gold-standard NP data is now available and will be useful for all parsers. We present three statistical methods for parsing NP structure. Firstly, we apply the Collins (2003) model, and find that its recovery of NP structure is significantly worse than its overall performance. Through much experimentation, we determine that this is not a result of the special base-NP model used by the parser, but is primarily caused by a lack of lexical information. Secondly, we construct a wide-coverage, large-scale NP Bracketing system, applying a supervised model to achieve excellent results. Our Penn Treebank data set, which is orders of magnitude larger than those used previously, makes this possible for the first time. We then implement and experiment with a wide variety of features in order to determine an optimal model. Having achieved this, we use the NP Bracketing system to reanalyse NPs output by the Collins (2003) parser. Our post-processor outperforms this state-of-the-art parser. For our third model, we convert the NP data to CCGbank (Hockenmaier and Steedman, 2007), a corpus that uses the Combinatory Categorial Grammar (CCG) formalism. We experiment with a CCG parser and again implement features that improve performance. We also evaluate the CCG parser against the Briscoe and Carroll (2006) reannotation of DepBank (King et al., 2003), another corpus that annotates NP structure. This supplies further evidence that parser performance is increased by improving the representation of NP structure. Finally, the error analysis we carry out on the CCG data shows that, again, a lack of lexicalisation causes difficulties for the parser. We find that NPs are particularly reliant on this lexical information, due to their exceptional productivity and the reduced explicitness present in modifier sequences. Our results show that NP parsing is a significantly harder task than parsing in general. This thesis comprehensively analyses the NP parsing task. Our contributions allow wide-coverage, large-scale NP parsers to be constructed for the first time, and motivate further NP parsing research for the future. The results of our work can provide significant benefits for many NLP tasks, as the crucial information contained in NP structure is now available for all downstream systems.

    Exploring Agricultural Production Systems and Their Fundamental Components with System Dynamics Modelling

    Agricultural production in the United States is undergoing marked changes due to rapid shifts in consumer demands, input costs, and concerns for food safety and environmental impact. Agricultural production systems comprise multidimensional components and drivers that interact in complex ways to influence production sustainability. In a mixed-methods approach, we combine qualitative and quantitative data to develop and simulate a system dynamics model that explores the systemic interaction of these drivers on the economic, environmental and social sustainability of agricultural production. We then use this model to evaluate the role of each driver in determining the differences in sustainability between three distinct production systems: crops only, livestock only, and an integrated crops and livestock system. These modelling efforts found that the greatest potential for sustainability existed with the crops-only production system. While this study presents a stand-alone contribution to sector knowledge and practice, it encourages future research in this sector that employs similar systems-based methods to enable more sustainable practices and policies within agricultural production.

    Factors affecting the determination of threshold doses for allergenic foods: How much is too much?

    Background: Ingestion of small amounts of an offending food can elicit adverse reactions in individuals with IgE-mediated food allergies. The threshold dose for provocation of such reactions is often considered to be zero. However, because of various practical limitations in food production and processing, foods may occasionally contain trace residues of the offending food. Are these very low, residual quantities hazardous to allergic consumers? How much of the offending food is too much? Very little quantitative information exists to allow any risk assessments to be conducted by the food industry. Objective: We sought to determine whether the quality and quantity of existing clinical data on threshold doses for commonly allergenic foods were sufficient to allow consensus to be reached on establishment of threshold doses for specific foods. Methods: In September 1999, 12 clinical allergists and other interested parties were invited to participate in a roundtable conference to share existing data on threshold doses and to discuss clinical approaches that would allow the acquisition of that information. Results: Considerable data were identified in clinical files relating to the threshold doses for peanut, cows' milk, and egg; limited data were available for other foods, such as fish and mustard. Conclusions: Because these data were often obtained by means of different protocols, the estimation of a threshold dose was very difficult. Development of a standardized protocol for clinical experiments to allow determination of the threshold dose is needed.

    Convection: the likely source of the medium-scale gravity waves observed in the OH airglow layer near Brasilia, Brazil, during the SpreadFEx campaign

    Six medium-scale gravity waves (GWs) with horizontal wavelengths of λH=60–160 km were detected on four nights by Taylor et al. (2009) in the OH airglow layer near Brasilia, at 15° S, 47° W, during the Spread F Experiment (SpreadFEx) in Brazil in 2005. We reverse and forward ray trace these GWs to the tropopause and into the thermosphere using a ray trace model which includes thermospheric dissipation. We identify the convective plumes, convective clusters, and convective regions which may have generated these GWs. We find that deep convection is the highly likely source of four of these GWs. We pinpoint the specific deep convective plumes which likely excited two of these GWs on the nights of 30 September and 1 October. On these nights, the source location/time uncertainties were small and deep convection was sporadic near the modeled source locations. We locate the regions containing deep convective plumes and clusters which likely excited the other two GWs. The last 2 GWs were probably also excited from deep convection; however, they must have been ducted ~500–700 km if so. Two of the GWs were likely downwards-propagating initially (after which they reflected upwards from the Earth's surface), while one of the GWs was likely upwards-propagating initially from the convective plume/cluster. We also estimate the amplitudes and vertical scales of these waves at the tropopause, and compare their scales with those from a simple, linear convection model. Finally, we calculate each GW's dissipation altitude, location, and amplitude. We find that the dissipation altitude depends sensitively on the winds at and above the OH layer. We also find that several of these GWs may have penetrated to high enough altitudes to potentially seed equatorial spread F (ESF) if located somewhat farther from the magnetic equator.

    The spread-F Experiment (SpreadFEx): Program overview and first results

    We performed an extensive experimental campaign (the spread F Experiment, or SpreadFEx) from September to November 2005 to attempt to define the role of neutral atmosphere dynamics, specifically wave motions propagating upward from the lower atmosphere, in seeding equatorial spread F and plasma bubbles extending to higher altitudes. Campaign measurements focused on the Brazilian sector and included ground-based optical, radar, digisonde, and GPS measurements at a number of fixed and temporary sites. Related data on convection and plasma bubble structures were also collected by GOES 12 and the GUVI instrument aboard the TIMED satellite. Initial results of our analyses of SpreadFEx and related data indicate 1) extensive gravity wave (GW) activity apparently linked to deep convection predominantly to the west of our measurement sites, 2) the presence of small-scale GW activity confined to lower altitudes, 3) larger-scale GW activity apparently penetrating to much higher altitudes suggested by electron density and TEC fluctuations in the E and F regions, 4) substantial GW amplitudes implied by digisonde electron densities, and 5) apparent direct links of these perturbations in the lower F region to spread F and plasma bubbles extending to much higher altitudes. Related efforts with correlative data are defining 6) the occurrence and locations of deep convection, 7) the spatial and temporal evolutions of plasma bubbles, 8) the 2D (height-resolved) structures of plasma bubbles, and 9) the expected propagation of GWs and tides from the lower atmosphere into the thermosphere and ionosphere.

    Identifying challenges and opportunities for improved nutrient management through U.S.D.A's Dairy Agroecosystem Working Group

    Nutrient management is a priority of U.S. dairy farms, although specific concerns vary across regions and management systems. To elucidate challenges and opportunities for improving nutrient use efficiencies, the USDA’s Dairy Agroecosystems Working Group investigated 10 case studies of confinement (including open lots and freestall housing) and grazing operations in the seven major U.S. dairy producing states. Simulation modeling was carried out using the Integrated Farm Systems Model over 25 years of historic weather data. Dairies with a preference for importing feed and exporting manure, common for simulated dry lot dairies of the arid west, had lower nutrient use efficiencies at the farm gate than freestall and tie-stall dairies in humid climates. Phosphorus (P) use efficiencies ranged from 33 to 82% of imported P, while nitrogen (N) use efficiencies were 25 to 50% of imported N. When viewed from a P budgeting perspective, environmental losses of P were generally negligible, especially from dry lot dairies. Opportunities for greater P use efficiency reside primarily in increasing on-farm feed production and reducing excess P in diets. In contrast with P, environmental losses of N were 50 to 75% of annual farm N inputs. For dry lot dairies, the greatest potential for N conservation is associated with ammonia (NH3) control from housing, whereas for freestall and tie-stall operations, N conservation opportunities vary with soil and manure management system. Given that fertilizer expenses are equivalent to 2 to 6% of annual farm profits, cost incentives do exist to improve nutrient use efficiencies. However, augmenting on-farm feed production represents an even greater opportunity, especially on large operations with high animal unit densities.

    Association of Carotid Plaque Lp-PLA2 with Macrophages and Chlamydia pneumoniae Infection among Patients at Risk for Stroke

    BACKGROUND: We previously showed that the burden of Chlamydia pneumoniae in carotid plaques was significantly associated with plaque interleukin (IL)-6, and serum IL-6 and C-reactive protein (CRP), suggesting that infected plaques contribute to systemic inflammatory markers in patients with stroke risk. Since lipoprotein-associated phospholipase A2 (Lp-PLA2) mediates inflammation in atherosclerosis, we hypothesized that serum Lp-PLA2 mass and activity levels and plaque Lp-PLA2 may be influenced by plaque C. pneumoniae infection. METHODOLOGY/PRINCIPAL FINDINGS: Forty-two patients underwent elective carotid endarterectomy. Tissue obtained at surgery was stained by immunohistochemistry for Lp-PLA2 grade, macrophages, IL-6, C. pneumoniae and CD4+ and CD8+ cells. Serum Lp-PLA2 activity and mass were measured using the colorimetric activity method (CAM) and ELISA, respectively. Serum homocysteine levels were measured by HPLC. Eleven (26.2%) patients were symptomatic with transient ischemic attacks. There was no correlation between patient risk factors (smoking, coronary artery disease, elevated cholesterol, diabetes, obesity, hypertension and family history of genetic disorders) for atherosclerosis and serum levels or plaque grade for Lp-PLA2. Plaque Lp-PLA2 correlated with serum homocysteine levels (p = 0.013), plaque macrophages (p<0.01), and plaque C. pneumoniae (p<0.001), which predominantly infected macrophages, co-localizing with Lp-PLA2. CONCLUSIONS: The significant association of plaque Lp-PLA2 with plaque macrophages and C. pneumoniae suggests an interactive role in accelerating inflammation in atherosclerosis. A possible mechanism for C. pneumoniae in the atherogenic process may involve infection of macrophages that induce Lp-PLA2 production leading to upregulation of inflammatory mediators in plaque tissue. Additional in vitro and in vivo research will be needed to advance our understanding of specific C. pneumoniae and Lp-PLA2 interactions in atherosclerosis.